03. Quiz: MC Control Methods

In this lesson, we'll work with a simple gridworld example to illustrate the main ideas. The gridworld is identical to the environment that we examined when learning about Monte Carlo (MC) methods. Please watch the next video to refresh your memory.

## Video

Gridworld Example

Before learning about Temporal-Difference control methods, check your knowledge of constant-$\alpha$ MC control by watching the video below.

## Video

Quiz: MC Control Methods

## Quiz

Suppose an agent is learning to navigate the gridworld described in the videos above. The agent uses constant-$\alpha$ MC control to search for the optimal policy, with $\alpha = 0.1$. At the end of the 99th episode, the Q-table has the following values:

Q-table

The 100th episode is shown below.

100th episode

What is the new value for the entry in the Q-table corresponding to state 1 and action right?

SOLUTION: 6.2
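
The answer comes from the constant-$\alpha$ MC update, which moves the old estimate a fraction $\alpha$ of the way toward the return $G$ observed after the state-action pair was visited: $Q(s, a) \leftarrow Q(s, a) + \alpha \, (G - Q(s, a))$. The sketch below walks through that arithmetic. The Q-table and episode figures are not reproduced in this text, so the inputs are hypothetical: an old estimate of 6, rewards of -1, -1, and 10 following the visit, and no discounting ($\gamma = 1$) are all assumptions, chosen only because they are consistent with the stated solution. The function names are illustrative as well.

```python
def episode_return(rewards, gamma=1.0):
    """Discounted sum of the rewards received after visiting (state, action)."""
    g = 0.0
    # Work backwards so each step folds in the discounted future return.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def constant_alpha_update(q_old, g, alpha):
    """Constant-alpha MC update: step from q_old toward the sampled return g."""
    return q_old + alpha * (g - q_old)


# Hypothetical inputs (assumptions, not read from the quiz figures).
q_old = 6.0                       # assumed Q(state 1, right) after episode 99
rewards_after_visit = [-1, -1, 10]  # assumed rewards following that visit
alpha = 0.1                       # step size given in the quiz

g = episode_return(rewards_after_visit)           # 8.0 with gamma = 1
q_new = constant_alpha_update(q_old, g, alpha)    # 6 + 0.1 * (8 - 6)
print(round(q_new, 2))  # prints 6.2
```

Only the entry for the visited state-action pair changes in this step; with a small $\alpha$ such as 0.1, a single episode shifts the estimate only slightly toward the sampled return.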